Kinetic Activation Relaxation Technique 
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We present a detailed description of the kinetic Activation- Relaxation Technique (k-ART), an 
off-lattice, self-learning kinetic Monte Carlo algorithm with on-the-fly event search. Combining a 
topological classification for local environments and event generation with ART nouveau, an efficient 
unbiased sampling method for finding transition states, k-ART can be applied to complex materials 
with atoms in off-lattice positions or with elastic deformations that cannot be handled with standard 
KMC approaches. In addition to presenting the various elements of the algorithm, we demonstrate 
the general character of k-ART by applying the algorithm to three challenging systems: self-defect 
annihilation in c-Si (crystalline silicon), self-interstitial diffusion in Fe and structural relaxation in 
a-Si (amorphous silicon). 

PACS numbers: 02.70.-c, 61.72.Cc, 81.10.Aj, 61.72.J- 



I. INTRODUCTION 

Solid-phase diffusion in materials science and con- 
densed matter is dominated by rare atomic diffusion 
events associated with high-energy barriers as measured 
with respect to temperature. These stochastic pro- 
cesses take place on an extended time scale that makes 
them very difficult to reproduce using linear simulation 
schemes such as molecular dynamics. Low rates, how- 
ever, allow us to consider this series of processes as inde- 
pendent Markov chains. In this case, it is possible to ap- 
ply the kinetic Monte Carlo (KMC) algorithm proposed 
by Bortz et al. [1-4]. Based on transition state theory, 
KMC uses a catalog of pre-specified diffusion mechanisms 
to compute at every stage the exit rate from a local min- 
imum. The clock is then advanced using Poisson's law, 
a move is selected with the appropriate rate, and a dy- 
namical trajectory is constructed. While time steps in 
KMC are dominated by the lowest energy barriers, it is 
possible, under the right conditions, to simulate on ex- 
perimental time scales. 

Since it was proposed, KMC has been used extensively 
in materials science, condensed matter, and many other 
fields. Its well-known limitations have nevertheless pre- 
vented the method from being applied widely to com- 
plex systems. In particular, since KMC depends on a 
predefined event catalog, systems under study have to 
be discretized and atomic motion limited to fixed lattice 
positions [3]. In this way, it is possible to evaluate from 
the onset all possible moves that will be included in the 



* laurent. karim. beland@umontreal.ca 
T peter. brommer@umontreal.ca 
^ fadwa.el_mellouhi@qatar.tamu.edu 
§ jean-francois.joly. l@umontreal.ca 
^ normand. mousseau@umontreal.ca 



catalog. For relatively simple kinetics, such as metal-on- 
metal growth, these limits are not major hurdles. They 
become a problem when trying to take full account of 
the lattice associated with long-range elastic effects and, 
more importantly, off-lattice and disordered conforma- 
tions. 

Over the years, a number of algorithms have been pro- 
posed to lift, at least partially, these limitations. Most 
can be classified in one of two categories: the addition 
of a continuum approximation for computing the effects 
of long-range strain deformations on the energy barri- 
ers and on-the-fiy evaluation of energy barriers. The 
first class of methods adds long-range contributions, com- 
puted from a number of extrapolation schemes, to the 
energy barriers extracted from the predefined catalog of 
events [5, 6]. The second class relaxes the need for a 
predefined catalog and, in some cases, moves away from 
a lattice-based description. This is the case, for exam- 
ple, of the self- learning KMC approach by Trushin et 
al.^ which keeps lattice-based displacement, but intro- 
duces an on-the-fiy search for barriers [7]. To remove 
the constraints of a lattice-based description, other meth- 
ods construct a new catalog at each step to determine 
the next step, using various climbing methods such as 
the dimer [8, 9], eigenvector- following [10], or the au- 
tonomous basin climbing methods [11]. 

The first class of methods [5-7] remains limited to 
on-lattice positions in addition to providing often ill- 
controlled corrections to energy barriers modified by elas- 
tic deformation. The second class, with an on-the-fiy, 
off- lattice approach, is much more fiexible. However, it 
is inefficient as a catalog must be rebuilt at each step, 
making it costly to study complex systems with a large 
number of possible diffusion mechanisms [8-11]. 

In 2008, we introduced the kinetic activation- 
relaxation technique (k-ART), an on-the-fiy, off-lattice 
KMC method that lifts these limitations [12]. In this ini- 
tial work, we showed that, for a system of vacancies in Si 



at 500 K, it achieves significant speed-ups over standard 
MD, while retaining a complete description of the rel- 
evant physics, including long-range elastic interactions. 
More recently, Kara et at. proposed a similar self-learning 
kinetic Monte Carlo method based, however, on less con- 
trolled procedures for finding barriers and classifying off- 
lattice configurations [13]. 

Here, we present in detail the k-ART algorithm, which 
couples the activation-relaxation technique (ART nou- 
veau) [14, 15] for generating events and calculating bar- 
riers with NAUTY [16] for the topological classification 
of events. We also present improvements introduced 
for handling low-energy barriers and large systems. Fi- 
nally, we demonstrate the efficiency and versatility of the 
method by applying it to three systems: self-defect anni- 
hilation in crystalline Si, interstitial diffusion in Fe and 
relaxation in amorphous silicon. 



II. OVERVIEW OF KINETIC ART 



and stored in the catalog. 

Once the extensive search for events on all topologies 
is finished, relevant events for the current configurations 
are collected. All low-energy barrier events are relaxed 
for specific atoms, to include not only topological but 
also geometrical effects. At this point, following Bortz et 
al. [1], the elapsed time to the next event is computed as 
At = — In fij ^^ Ti where /i is a random number in the 
[0, 1[ interval and r^ is the rate associated with event i. 
The clock is pushed forward, an event is selected with 
the proper weight, and the atoms are moved accordingly, 
after a geometrical reconstruction. 

Once in this new configuration, the process starts 
again: the topology of all atoms belonging to the lo- 
cal environment around the new state is constructed; if 
an unknown topology is found, a series of ART nouveau 
searches are launched, otherwise, we proceed to the next 
step. After all events are updated, the low-lying barri- 
ers are, once again, relaxed before applying the KMC 
algorithm. 



Following standard KMC, the k-ART method uses an 
event catalog to compute the rate of escape from a local 
minimum and bring forward the simulation clock. There 
are three fundamental differences with respect to stan- 
dard KMC, however. First, discretization of the local en- 
vironment is done through topology instead of geometry, 
allowing atoms to adopt freely any spatial arrangement 
instead of being constrained to predefined lattice posi- 
tions. Second, the catalog is not fixed at the simulation 
onset, but grows as new local environments are visited, 
allowing the study of very complex systems. Third, event 
energy barriers are fully relaxed at each step to take into 
account all geometrical rearrangements due to short- and 
long-range elastic deformations. 

A simulation starts from an initial configuration re- 
laxed into a local energy minimum. The local topol- 
ogy associated with each atom is first characterized with 
NAUTY [16]. All atoms sharing a specific topology are 
presumed to be associated with the same list of activated 
mechanisms. This is the basic assumption of k-ART and, 
as we will see below, it can be made to hold on a per-atom 
basis. This approach results in a considerably reduced 
amount of generated and handled data. For a single va- 
cancy in crystalline silicon (c-Si), for example, one only 
needs to consider 20 different topologies to describe all 
local environments, irrespective of the system size. 

A search for activated pathways is launched for each 
topology using ART nouveau [14, 15]. ART nouveau 
was shown to identify efficiently relevant diffusion mech- 
anisms in systems described with both empirical and ah 
initio methods [17-20]. This method has been exten- 
sively characterized in Ref. [21]. A number of techni- 
cal improvements are also reported in Ref. [22] and new 
events can now be generated with as little as 300 force 
evaluations with either empirical or ah initio potentials. 
Each event is classified according to the initial minimum, 
the saddle configuration and the final state topologies 



III. ALGORITHMIC AND IMPLEMENTATION 
DETAILS 

A. ART nouveau 

The search for activated mechanisms is performed 
using ART nouveau [14, 15], an open-ended climbing 
method for finding first-order saddle points surrounding 
a local minimum. As the most recent version of the al- 
gorithm is described in Refs. [21] and [22], we give here 
only a brief overview of the method. Event search with 
ART nouveau proceeds in three steps: (1) starting from 
an energy minimum, the system is deformed locally in a 
random direction until the lowest curvature of the Hes- 
sian matrix becomes negative, indicating an instability; 
(2) the configuration is then pushed along this direction 
of negative curvature while the energy is minimized in 
the hyperplane orthogonal to this direction until the to- 
tal force falls below a set threshold, indicating that a 
first-order saddle has been reached; (3) the configuration 
is pushed over this point and is relaxed into a new mini- 
mum. This set of three configurations — initial minimum, 
saddle point, and final minimum — forms an event. 

Since activated processes are local in nature, each event 
is initiated by displacing a given atom and its neigh- 
bors in a random direction. The exact size for this dis- 
placed region depends on the system studied. In semi- 
conductors, it involves typically first and second nearest- 
neighbors. In the case of Si vacancies, for example, the 
displacements are applied to the central atom and all 
atoms within a 3.0 A radius from it. The initial con- 
vergence criterion for the saddle point is typically set to 
0.5 eV/A. This is associated with the generation of the 
generic event catalog (see below) . Further relaxation as- 
sociated with the calculation of specific events uses a 0.1 
eV/A threshold. 



To decrease computational cost, ART nouveau never 
computes the Hessian directly, but rather uses a mixture 
of Lanczos [23] and DIIS [24] methods for converging to 
the saddle point. As discussed in Ref. [22], less than 
300 force evaluations are generally needed to converge 
to a first-order saddle point. Taking into account all 
processes, including non-converging steps and relaxation 
into a new minimum, about 600-800 force evaluations are 
required, on average, per successful event search. Relax- 
ation of specific events, which starts near a reconstructed 
saddle point, are normally much faster, necessitating typ- 
ically 1 to 80 force evaluations. 
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B. Topological classification and event generation 

Topological classification of the local atomic environ- 
ment is a crucial step in k-ART, as it provides a means 
of discretizing and cataloging local configurations, while 
taking into account all possible atomic arrangements and 
elastic deformations. 

Atomic topologies for a given local configuration are 
computed as follows. We define a local environment con- 
sisting of all atoms within a sphere of a predefined radius 
centered around each atom, as illustrated in Fig. 1. These 
are then connected following a neighboring prescription, 
such as first neighbor distance cut-off or a Vorono'i tessel- 
lation, forming a truncated connectivity graph, that is, a 
set of bonds connecting vertices, without geometrical in- 
formation. This graph is then analyzed and classified 
using the freely available topological software nauty, 
developed by McKay [16]. This software package pro- 
vides the topology index and all information necessary 
for uniquely identifying each environment, including the 
permutation key needed to reconstruct a specific geome- 
try from the generic topology and a set of reference po- 
sitions. The topology index from NAUTY has the form of 
an ordered set of three integers, which we use as input 
to a hash function to generate a unique topology label 
(hash key). 

The geometrical reconstruction from a purely topolog- 
ical graph is made possible because we know the atomic 
positions of all atoms surrounding the local configura- 
tion described by this graph. This introduces sufficient 
constraints to ensure that most of time, as discussed be- 
low, a given graph corresponds to a unique fully relaxed 
geometry. 

The neighboring prescription and the size of the trun- 
cated region are selected to ensure that, in most cases, the 
configuration is uniquely defined through this network, 
that is, the connectivity graph must lead to a unique 
structure once relaxed with a given interatomic poten- 
tial. In the case of crystalline Si, for example, we define 
the local environment around an atom by a sphere of ra- 
dius 5.0 A, which includes about 40 atoms; two atoms 
are linked if their distance is less than 2.8 A. 

Once all new topologies are identified, a succession of 
ART nouveau searches are launched on each of them. 



FIG. 1. (Color online) Local topology analysis procedure in 
k-ART. A truncated graph (b) is extracted from the complete 
lattice (a). This graph is analyzed through nauty (c), which 
returns a unique key and the associated topology (d). 



The optimal number of searches done per topology de- 
pends on the nature of the system and can be adjusted 
to ensure that all low-energy barrier events, which domi- 
nate the dynamics, are found. In our current implemen- 
tation, it is increased also with the number of times a 
given topology is seen, ensuring that the most common 
topologies are explored more often. 

Every new event found with ART nouveau is analyzed 
and compared to the list of already known events for that 
initial topology. If an event with the same activation en- 
ergy is already in memory, a series of tests are performed 
to assess whether or not it is the same event. When the 
generated event is judged to be new, it is then assigned to 
the topology centered on the atom that moved the most 
during the event, irrespective of the initial activation. 
The event label is obtained by combining the topology 
labels at the initial, saddle, and final states. We also keep 
in memory the position of the cluster of atoms for the ini- 
tial, saddle, and final configurations. These are necessary 
(a) as a reference for the geometrical reconstruction from 
the saddle or the final configuration topological mapping 
for atoms belonging to the same topological class and (b) 
to compare the event in the list with newly generated 
events. Once a generic event is added to the database, 
it is also added to the binary tree of events and to the 
histogram (see Sec. IIIC). 

Finally, based on these data, a first generic rate is as- 
sociated with the event by setting: 



ri = TQe^Y^{-AEi/kBT), 



(1) 



where tq, the attempt frequency, is fixed at the onset and, 
for simplification, assumed to be the same for all events 
(10^^ s~^). AEi is the barrier height, that is, the energy 
difference between saddle point and initial minimum, ks 
and T are, as usual, the Boltzmann constant and the 
temperature, respectively. 

While using a fixed attempt frequency is a simplifica- 
tion, it has been shown in several systems that tq varies 



only weakly with the chosen pathway [25]. The value 
of 10^^ s~^ is compatible with pre-exponential factors in 
iron derived from experiment [26] and in simulation [27]. 
For silicon, this value is also compatible with ab initio 
computations of neutral vacancy diffusion [19]. 

In certain symmetric configurations, it is possible that 
distinct events associated with a given atom have iden- 
tical topologies of initial state, saddle point, and final 
state, and the same barrier height and absolute displace- 
ment. To distinguish between those and fully account for 
the symmetries, we also include the direction of atomic 
motion in the description of events. Checks on a num- 
ber of highly symmetric states in Si and Fe show that 
this classification recovers all pathways and accurately 
discriminates between them. 

Hash keys provide a fast way of storing and retrieving 
event or topological information. If the key of a new event 
or topology is already in use (a so-called hash collision), 
the key value is incremented by one until a free key is 
found. The arrays for events and topologies account for 
most of the memory used in k-ART simulations, and so 
a balance has to be struck between size and speed. Too 
large arrays waste RAM, while too small ones provoke 
frequent hash collisions and fill up earlier. 

One of the major advantages of this approach is that 
this constantly updated catalog of generic events and 
topologies can be saved to the disk, made available to 
others, and reused on the same and similar systems. Mul- 
tiple catalogs from different k-ART simulations can also 
be merged to create a larger database that can be used 
to start new simulations on the same system with a sig- 
nificant speed increase. 



Adaptive generic event relaxation into specific 
events 



Events generated for a given topology are known as 
generic events. It is assumed that all atoms sharing the 
same topology will have access to these events with, how- 
ever, a small adjustment to the energy barrier due to local 
variations in position or long ranged elastic interactions. 
To take these changes into account, generic events with 
low barriers are re-converged for each realization, result- 
ing in specific events^ each linked to a particular atom. 

Starting from a common topological generic event, a 
specific event is generated for each atom having the same 
topology by taking advantage of the ordered list of clus- 
ter atoms around the central atom obtained using nauty 
(permutation key). With this information, it is possible 
to reconstruct the geometry of the specific saddle and 
final minimum conformations by mapping the displace- 
ment vector and transforming geometrically a given re- 
gion in the system with respect to the generic configura- 
tion. 

In an earlier version of the algorithm [12], specific 
events were identified as those belonging to generic 
classes with a an energy barrier of 15 /c^T or less. Here, 
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FIG. 2. (Color online) Generic events stored in a histogram 
used for an adaptive event relaxation procedure. 



we adopt rather an adaptive algorithm based on the ki- 
netics of the system. All generic events are ordered and 
stored into slices of 0.1 eV according to their energy bar- 
rier in a histogram (see Fig. 2). For each slice, the total 
rate is computed and the cumulative rate up to each en- 
ergy barrier range is calculated. 

The relaxation into specific events starts from the bot- 
tom of the histogram and then proceeds to higher energy 
barriers until events accounting for a large fraction of the 
total rate are relaxed (we use 99.9 % here). The remain- 
ing generic events are copied to unrefined specific events. 
In this way, we ensure that almost all selected moves will 
be picked from the list of fully relaxed individual barri- 
ers. Geometric and elastic effects on the energies of local 
minima and transition states are therefore fully included 
in the rates. 



D. Crucial aspects 

1. Uniqueness in the correspondence between a geometry 
and topology 

One of the main k-ART advantages resides in the 
topology-based discretization. To be valid, this classi- 
fication requires a unique correspondence between a lo- 
cal geometry and a topology. With the right building 
rules for the truncated graph, it turns out that this ap- 
proximation almost always works for covalently bonded 
systems, but also for metals. Since k-ART relies on the 
delicate reconstruction of transition states for computing 
specific events, a failure of the relation between topology 
and geometry leads systematically to the disappearance 
of this first-order saddle point and can be easily detected. 
In other words, event reconstruction automatically fails 
when a given topology corresponds to more than one ge- 



ometry. 

When an ill-defined topology corresponding to more 
than one geometry is encountered for the first time, k- 
ART automatically adjusts the atom- atom connectiv- 
ity cutoff used within the cluster of atoms forming this 
graph, until the geometries initially associated with the 
same topology are now placed into two different classes. 
The unique correspondence between local geometry and 
topology is now re-established. 

For computational ease, a marker identifies already en- 
countered difficult topologies so that k-ART can recog- 
nize them on the ffy and test them to ensure that the 
correct event catalog is used. To do so, k-ART maps one 
event from each of the associated sub-topologies, until a 
successful map is achieved. If no event can be mapped 
using the current sub- topologies, a new label is again 
issued. Each difficult topology can have as many sub- 
topologies associated with it as necessary. In practice, 
topology collisions are extremely rare, if reasonable clus- 
ter and neighbor cutoffs are used. If there is a collision, 
it can usually be resolved with only one sub-topology. 



2. Handling low- energy barriers 

In KMC, dynamics is dominated by a system's low- 
est barrier energy. When the energy landscape consists 
of basins with numerous states connected by very low- 
energy barriers compared to those needed to leave these 
basins, the algorithm becomes trapped into computing 
non-diffusive events, decreasing significantly its efficiency 
in two ways. First, it limits the attainable simulated 
time, as the low-energy internal barriers produce a high 
total rate sum and thus a short average KMC time in- 
crement. Second, computational resources are bound to 
explore the states within a basin without yielding much 
information, as effective diffusion takes place typically 
outside the energy basins. 

We had previously implemented a TABU-like ap- 
proach [28], that bans transitions rather than states 
[29]. This algorithm is simple to implement and pro- 
vides a thermodynamical solution when there is a clear 
energy separation; it fails, however, when few path- 
ways are available or the energy spectrum is continu- 
ous. To account for these situations, we developed the 
basin-autoconstructing mean rate method (bac-MRM), 
a basin-based acceleration scheme inspired by the mean 
rate method (MRM) of Puchala et al. [30]. A descrip- 
tion of MRM can be found in the Appendix. In summary, 
MRM separates the trajectory into transient states and 
absorbing states, and accelerates the simulation by av- 
eraging over all possible jumps between transient states, 
yielding the correct probability to exit a basin to a cer- 
tain absorbing state. 

In kinetic ART, the relevant entities are not states, 
but events^ characterized by an energy barrier between 
the initial and final states. In an event with energies 
Ei^ Es^ and Ef at the initial state, the saddle point. 



and the final state, respectively, we define the forward 
energy barrier as hf = Eg — Ei^ and the inverse barrier 
by hi = Eg — Ef. In both cases, rates going forward or 
backward are determined by Eq. (1). Basins are then 
identified on the fiy by the barrier heights separating the 
basin states: Both the forward and the inverse barrier 
must be smaller than a user-defined threshold. 

Starting in a local minimum, low-energy barrier events 
are marked as potential basin events, and the atomic dis- 
placement associated with these events is stored. If such 
an event is picked for execution, it is added to the current 
network of basin events (i.e., for the first basin event, this 
is at that time the only event), removed from the tree of 
available events, and executed as normal. The system is 
then in state two, and the k-ART event finding algorithm 
is started from this state. All events from previous basin 
states are kept in the tree and could be picked as KMC 
move. 

After state two has been searched for possible events, 
and before the next event is picked, the MRM is applied 
to the basin consisting of two states: The rates leaving 
the basin are modified following Eq. (A. 4), and the to- 
tal rate is adjusted accordingly. The next step is then 
selected. It can either lead to a new basin state, to be 
added according to the procedure described above, or out 
of the basin, in which case a standard k-ART move is ap- 
plied. If an event is found to lead to an already explored 
basin state, it is rejected, removed from the tree, and 
added to the basin, adjusting the rates as needed. 

Bac-MRM explores basins on the fiy, and only as far as 
necessary. Simultaneously, no state is intentionally vis- 
ited twice. While the internal dynamics within a basin is 
lost, the basin mean rate method in our implementation 
yields the correct distribution of exit states depending on 
the basin internal rates and the point of entry into the 
basin. 

The computational overhead of bac-MRM is small and 
the CPU resources needed for all basin related operations 
are negligible compared to the time required to explore 
a single topology. 



3. Optimizing k-ART for large scale systems 

Because activated mechanisms are local in nature, it is 
relatively straightforward to optimize k-ART to handle 
systems with several tens of thousands of atoms. Indeed, 
diffusive motion typically involves regions composed of 
a few tens to a few hundreds of atoms, and the forces 
induced by this displacement typically propagate up to 
a few nanometers. Therefore, by coupling standard cell 
lists [31] and the Verlet algorithm [32] for constructing 
neighbor lists with a local force calculation, the com- 
putational effort of generating an event becomes almost 
system-size independent. 

Local forces are first computed on all atoms involved 
in the event plus their first and second neighbors. As 
the system evolves, atoms on which the force exceeds a 
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set threshold (of 0.01 eV/A, in the case of c-Si) become 
labeled as involved and their first and second neighbors 
are added to the list. This process ensures that forces 
are computed only on the relevant atoms. The genera- 
tion of new events and the relaxation of specific events 
therefore is entirely local, with only a global minimization 
performed after each KMC move to take into account all 
elastic effects. 

We simulated several systems composed of 2744 to 
27000 c-Si atoms using an empirical Stillinger- Weber po- 
tential, periodic boundary conditions, and roughly the 
same density of vacancies and interstitials in equal num- 
bers. The scaling is given as a function of system size in 
Fig. 3 for exploring a new topology on a single processor. 
Because of rare global calculations during the analysis 
of events, the algorithm has a weak linear dependance 
of 0.03 s per atom. However, the sub-linear terms, re- 
sponsible for about 600 s, are dominant for system size 
of several tens of thousands of atoms. These sub-linear 
terms are associated to the time required to attempt 15 
saddle point searches on one topology and to analyze the 
results as computed on a single core of a 2.66GHz Intel 
Xeon X5550 CPU (all CPU times reported in this paper 
are computed on the same CPU). 



IV. APPLICATION OF K-ART 

We now apply k-ART to three different systems, in 
order to demonstrate the flexibility of the method: (a) 
Vacancies and interstitials in c-Si. This work expands on 
the vacancy diffusion study presented in Ref. [12]. (b) In- 
terstitials in Fe. Diffusion mechanisms for self-interstitial 




FIG. 4. (Color online) The initial state of our 8000-atoms c-Si 
box containing eight vacancies and eight interstitials. We only 
show over-coordinated (small spheres) and under-coordinated 
atoms (large spheres). 



in iron are surprisingly complex; this system represents 
a good test of k-ART's ability to sample such landscape. 
(c) Relaxation of amorphous silicon (a-Si). By construc- 
tion, KMC methods have been mostly limited to lattice- 
based problems; here, we show that the topological ap- 
proach of k-ART is sufficiently flexible to handle disor- 
dered materials. 



A. Vacancies and interstitials in Si 

We studied the annealing of eight pairs of vacancies 
and interstitials in an 8000 atoms c-Si box at 500 K using 
the Stillinger- Weber potential [33] and periodic bound- 
ary conditions. A snapshot of the initial state of the 
simulation is shown in Fig. 4. With this interatomic po- 
tential, the diffusion activation barrier for an individual 
vacancy is about 0.43 eV [34]. Interstitials can be stabi- 
lized in three states, two of which are almost degenerate 
in energy at about 0.75 eV above the first one. Diffusion 
for the single self-interstitial is dominated by a barrier at 
0.94 eV [35]. 

Figure 5 (bottom) shows the evolution of the total en- 
ergy, measured with respect to the crystalline state, and 
the squared displacement as a function of simulated time 
for the system above, representing a total of 2000 k-ART 
steps. Vacancies dominate the diffusion with significantly 
lower energy barriers. Each of the five large drops in 
energy corresponds to the annihilation of an interstitial- 
vacancy (IV) pair, while from roughly 1 /is on, bound de- 
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FIG. 5. (Color online) Simulation of eight IV pairs in an 800- 
atom box at 500 K. Top: The number of encountered topolo- 
gies and the computation time as a function of the simulation 
time. Bottom: The evolution of the total energy, measured 
from the perfect crystal, and of the the squared total displace- 
ment as a function of the simulation time. The zero on the 
energy scale corresponds to a box with no defect. Arrows indi- 
cate important interstitial- vacancy annihilation states shown 
as snapshots in Figs. 6 and 7. 
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FIG. 6. (Color online) A typical interstitial- vacancy recombi- 
nation. Time are measured from the initial snapshop, taken at 
0.2 /is and indicated by arrow B in Fig. 5. Over-coordinated 
atoms are shown as small spheres and under-coordinated 
atoms are shown as large spheres. The empty figure in (g) 
indicates that there are no more topological defects in the 
local environment. 



fects reorganize themselves without any recombination. 
We show in Fig. 6 an example of a typical isolated re- 
combination. We observe several metastable states for 
the bound pair, before recombination, in general agree- 
ment with the findings of Tang et al. [36] and Marques 
et al [37]. 

Energy fluctuations in Fig. 5 are also associated with 
the formation of self-defect aggregates, such as a bi- 
interstitial and a vacancy complex, as well as elastic de- 
formations. These can cause important differences be- 
tween IV-pair recombinations. This is the case of the 
second IV recombination (at t = 0.08 /is), which dif- 
fers markedly from the four others. Indeed, the presence 
of nearby vacancies (at a distance of about 1 nm from 
the main interstitial and vacancy) significantly modifies 
the relaxation mechanism. They introduce an interme- 



diate oscillatory state which flickers for a duration of 
nearly 45 ns (see Fig. 7) before IV recombination. This 
metastable state underlines the importance of correctly 
handling long-range elastic and topological effects for the 
defect kinetics in semiconductors. 

Even small changes in the environment, up to 20 A 
away from the defect, can change the kinetics of defect 
diffusion and recombination. For example, IV-pair re- 
combination, for a nearly, but not completely isolated 
pair can follow at least two pathways, with barriers dif- 
fering significantly: 0.45, 0.39, 0.19, and 0.20 eV, in the 
first case, which is in good agreement with Gilmer et 
al. [38], and 0.36, 0.19, 0.22, and 0.51 eV, in the second. 

The advantage of k-ART's approach to catalog build- 
ing can be seen in Fig. 5 (top). Since only previously seen 
topologies are included, the catalog grows as new regions 
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FIG. 7. (Color online) Oscillation of an interstitial-divacancy 
complex during an interstitial-vacancy recombination in the 
presence of nearby vacancies. Time is measured from the 
initial snapshot, taken at 50 ns and indicated by arrow A in 
Fig. 5. Over-coordinated atoms are shown as small spheres 
and under-coordinated atoms are shown as large spheres. 



of the configurational space are visited, avoiding the need 
to construct all possible conformations from the onset, a 
task that would rapidly be impossible, even for a system 
with 16 defects. We see that as the defects first diffuse in 
isolation for the first 20 ns, very few new topologies are 
encountered. It is only as the recombination processes 
take place, between t = 20 ns and 1 /is, that many new 
environments are visited, increasing rapidly the number 
of sampled conformations from a few hundreds to almost 
20 000 topologies. While this number is large, all events 
are stored in a catalog and serve as such in any new sim- 
ulation, reducing considerably the CPU cost over time. 



B. Self-interstitial cluster diffusion in iron 

Iron is widely used in nuclear power plants, which 
makes the modeling of irradiation-induced defects inter- 
esting, in particular single and clustered self-interstitial 
atoms (SI A) [39], from the microscopic to large scale, 
an important topic in computational materials science 
[40, 41]. 

Simulating SIAs on an atomistic level is challenging 
for both molecular dynamics (MD) and standard KMC 
methods. On the one hand, the activation energies of de- 
fect migrations are rather high, so that MD simulations 
must be performed at comparatively high temperatures 
(up to 1200 K, cf. Refs. [42, 43]) to explore the energy 
landscape within the accessible simulation times. This 
makes it difficult to identify structures and mechanisms 
important at the significantly lower operating temper- 
atures. On the other hand, due to the wealth of ar- 
rangements of interstitial atoms and other defects, it is a 
formidable task to build a catalog of possible transitions 
a priori (to use in a KMC simulation), without acci- 



dentally neglecting important migration paths. Often, a 
very reduced catalog of transition pathways is used [44] . 
All this is complicated by the fact that in iron clusters 
of interstitial atoms can glide in one dimension with very 
small migration energies (tens of milli-electron volts) [42] . 

K-ART is an ideal tool to explore the migration path- 
ways of SIAs without any assumptions needed to assem- 
ble a catalog of events and the associated barriers a pri- 
ori. Events are searched for on the fly for each topol- 
ogy in the system, with corrections applied at each KMC 
step to account for elastic distortions by surrounding de- 
fects. As outlined in Sec. HID 2, low barriers can also be 
treated efficiently with the basin-autoconstructing mean 
rate method. 

For our iron simulations, we used the Ackland- 
Mendelev potentials [45], an improved version of the 
potential developed by Mendelev et al. [46]. This po- 
tential describes defects accurately [47] and was used in 
MD [42, 43] and ART nouveau [21] simulations of iron 
SI A systems. The EAM energy and force calculation rou- 
tine was adapted from IMD [48] and can use any tabu- 
lated potential in IMD form. 

To test the implementation, a single self-interstitial 
atom was embedded in a 1024-atom supercell. In 
the ground state, the interstitial forms a dumbbell in 
[110] direction in agreement with earlier static simula- 
tions [49] and ART studies [21]. The transition with 
the lowest energy barrier follows the nearest neighbor 
(NN) translation-rotation mechanism proposed by John- 
son [50] with an activation barrier of 0.3 eV. Higher bar- 
rier events include a transformation to the [11^] dumb- 
bell, followed by an on-site rotation, then pure transla- 
tions to first and second NN sites. Other higher-energy 
events were found, but were not picked to be executed 
during our simulations. 

In simulations with two clustered inter st it ials, k-ART 
recovers the mechanism for interstitial migration sug- 
gested by Johnson [50]: Both interstitials each perform 
a nearest-neighbor translation-rotation jump. This can 
happen in a single step, or in two sequential moves. The 
states and barriers found in our simulation agree with the 
results from Marinica et al. [21]. A sample trajectory of 
the di-interstitial system over 2000 KMC steps is shown 
in Fig. 8. 

For a single self-interstitial, at the temperature of in- 
terest, barriers are well above /c^T, and there are no 
flickers. For SI A clusters, however, there are sequences 
of states separated by low barriers. In the di-interstitial 
case, Marinica et al. [21] find a basin of 0.25 eV above 
the ground state, with barriers below 0.1 eV separating 
the states. We reproduce that basin as demonstrated in 
a detail of the trajectory in Fig. 9. While the system 
explores the basin, low-energy barriers keep the system 
clock almost at a standstill. 

In a system with a 4-SIA cluster, a basin is found 
around the ground state: Small reorientations of the four 
dumbbells lead to about 20 unique configurations sep- 
arated by extremely low barriers (<0.1 eV). Since the 
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FIG. 8. (Color online) Simulation of an iron di-interstital 
cluster in a 1024-atom box at 300 K. Top: The number of en- 
countered topologies and the computation time as a function 
of the simulation time. Bottom: The evolution of the total en- 
ergy, measured from the ground state, and of the the squared 
total displacement as a function of the simulation time. The 
system propagates mainly by the two-step Johnson process 
with an activation energy around 0.33 eV. The simulation 
was performed using a pre-built catalog constructed from an 
earlier simulation, initially containing information about 9074 
topologies. The simulation stalls when many new topologies 
must be explored. The arrow marks the basin shown in detail 
in Fig. 9. In that basin, the activation energy is under 0.3 eV. 
Two more basins are encountered at 3.8 and 5.5 /xs. 



dynamics is dominated by low-barrier events, the system 
manages to exit the basin only when all those states have 
been explored. A sample trajectory over 290 KMC steps 
is displayed in Fig. 10. 

The wealth of structures and transition pathways 
found in iron systems with SIA clusters is virtually im- 
possible to include in a catalog assembled a priori for 
standard KMC simulations. In contrast, the self- learning 
k- ART program will over the course of a simulation build 



0.6 



> 0.4 





Energy 
Simulated time 



1005 




350 360 370 380 390 400 410 
KMC step 

FIG. 9. (Color online) Detail of Fig. 8 (KMC steps 350-413, 
0.99 /is): After a series of two-step Johnson jumps, the system 
crosses into an excited state at KMC step 235. Then a number 
of states forming a basin are traversed. Shaded background 
indicates a basin (oscillatory) motion. After an exit event, 
the system resumes its basin trajectory, until another exit 
event leads it back to the ground state, from where it resumes 
its two-step motions. The energy trajectory passes through 
minima and saddle points alternately (minima marked by 
crosses). During the basin motion, the system clock is hardly 
moving. 



a database of these configurations and events, thus sav- 
ing time once the system revisits previous states. The 
presence of basins (i.e. groups of states separated by low 
barriers) in these systems dictates the use of an acceler- 
ation scheme. With bac-MRM, even such a rich system 
becomes a tractable problem in KMC simulation. 



C. Relaxation dynamics in amorphous silicon 

As kinetic ART is an off-lattice method with with self- 
learning catalog building capabilities, it can also be used 
to study relaxation of disordered materials on long time 
scales. There have been previous applications of similar 
techniques to these systems, but these have suffered from 
limited sampling of events [10, 51]. 

As a test case, we looked at amorphous silicon. Like 
crystalline silicon, this allotrope of silicon is fourfold co- 
ordinated with randomly oriented tetrahedra causing the 
loss of medium and long range order in the system. 
This model system has been extensively studied with 
ART [14, 52] and ART nouveau [18, 53] and constitutes 
therefore a well-controlled model. Moreover, many fun- 
damental questions remain regarding its dynamical prop- 
erties. For example, in spite of considerable experimental 
efforts, the exact nature of defects responsible for struc- 
tural relaxation is still a matter of debate [54-57]. As 
for many other disordered systems, only methods able 
to reach experimental timescales will be able to offer a 
satisfactory answer to these questions. 
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FIG. 10. (Color online) Trajectory of a tetra-interstitial clus- 
ter at 300 K in a 1024-atom cubic box over 290 KMC steps. 
Top: The simulation and the computation time as a function 
of KMC steps. Bottom: The evolution of the total energy, 
measured from the perfect crystal, and of the the squared 
total displacement as a function of KMC steps. The system 
makes several attempts to leave the ground state basin, but 
falls back until, at KMC step 234, it succeeds. It then moves 
through a sequence of excited states, before dropping back to 
a different ground state basin, with the whole cluster diffus- 
ing to the nearest neighbor site. Different background colors 
represent different basins (white: outside of basin). As the 
barriers (shown as impulses in the lower plot) are compar- 
atively low, the clock advancement is rather small. Only if 
a barrier exceeding the basin threshold is picked, the KMC 
time step is noticeable. A significant share of the CPU time is 
spent exploring the sequence of excited states between steps 
234 and 250. 



We start here with a well-relaxed 1000-atom a-Si 
configuration with periodic boundary conditions gener- 
ated with the modified Wooten- Winer- Weaire procedure 
[58, 59] and a reparametrized version of the popular 
Stillinger- Weber potential by Vink et al. [60] adjusted 
to describe appropriately this allotrope. All atoms in the 
generated a- Si sample are perfectly coordinated with a 
clean electronic gap [61] and a good agreement with the 
experimental radial distribution function [59]. 

For a disordered system, the advantage of recycling 
events based on the local atomic topologies takes a lot 



FIG. 11. (Color online) Simulation of a-Si in a 1000-atom box 
at 300 K. Top: The number of encountered topologies and the 
computation time as a function of the simulation time. Bot- 
tom: The evolution of the total energy and of the the squared 
total displacement as a function of the simulation time. The 
simulation was started with a catalog from an earlier simula- 
tion. The system flickers between two neighboring states until 
it finds a way to relax further. This leads to a sequence of con- 
figurations never seen before and the CPU time needed per 
step increases with the number of new topologies to explore. 



of time before becoming noticeable. For a well-relaxed 
1000-atom model of a-Si, for example, no two atoms share 
the same topology and even after many thousands of 
events, topologies encountered more than once are rare. 
A meaningful catalog requires therefore the combination 
of many independent KMC trajectories started from var- 
ious initial configurations. 

At first, since each atom has its own topology, the 
number of initial events to be generated is very large. 
Successive steps tend to be much less expensive and the 
number of new topologies per step depends strongly on 
the amplitude of the displacement during the previous 
KMC step. Small displacements observed during flickers 
usually result in less than 10 new topologies while diffu- 
sive events can generate up to 140 new topologies in a 
single step. 

The distribution of activation barriers is similar to the 
one found in previous ART studies [18, 53]. Although it 
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is continuous over a wide range of activation energies, the 
kinetics is dominated by non-diffusive low energy barri- 
ers. Contrary to crystalline systems, where a clear energy 
threshold separates diffusive from non-diffusive events, it 
is necessary to fix the basin-threshold somewhat arbi- 
trarily in the mean-rate method. Here we chose a cutoff 
of 0.3 eV for a simulation temperature of 300 K. There- 
fore, events associated with timescale of 16 ns or less are 
averaged over and the internal dynamics of these events 
is ignored. This is acceptable, as we are interested in 
simulations on the time scale of microseconds or more. 

Figure 11 (bottom) shows the evolution of the total 
configurational energy as a function simulated time for a 
simulation of 3360 k-ART steps. Since we started from 
a very well-relaxed configuration, the proportion of flick- 
ers is important but the system still manages to relax 
by more than 6 eV over a 12 /is simulation. While the 
initial relaxation with the modified Stillinger- Weber po- 
tential leaves the system perfectly coordinated, an aver- 
age of 0.8 at.% defects are created in a few KMC steps. 
This concentration is relatively constant throughout the 
simulation. Moreover, almost all the low barrier events 
involve coordination defects. Defect migration events are 
hard to characterize since local atomic motions can af- 
fect the existence of low energy defects up to the third 
nearest neighbor distance. This can cause some defects 
to disappear while creating new ones. 

The 4 eV drop at 10 /is is initiated by a bond switch- 
ing event of two four- fold coordinated neighboring atoms 
with a barrier of 0.28 eV. This allows for one atom to get 
rid of a highly strained bond, resulting in a energy drop 
of 0.84 eV. This event is then followed by a succession 
of 84 smaller relaxation events involving mostly sponta- 
neous creation or destruction of low-energy coordination 
defects. The configuration eventually ends up in a lower 
energy basin where flickers again dominate. The average 
defect population goes from 0.8 to 1.4 at.% during the 
entire process. 

K-ART in a-Si can be compared, on short times, with 
molecular dynamics. Using the MD software LAMMPS 
[62] with our a-Si model, we launch a 10-ns simulation 
at 300 K with a timestep of 2 fs. At regular intervals, a 
configuration is frozen and relaxed into the nearest local 
minimum using steepest descent in order to compare with 
our k-ART simulation. Results (not shown) confirm that 
almost no deformation takes place on this short time scale 
and both MD and k-ART display atomic displacement of 
the same amplitude. 

Total simulation time for this system is significant and 
Fig. 11 (top) shows the evolution of simulation time as a 
function of computer time for a code running on a single 
2.66 GHz Intel Xeon X5550 CPU starting from a pre- 
constructed catalog. The use of a parallel version of the 
algorithm coupled with a more extensive catalog is ex- 
pected to reduce considerably the computational efforts 
for this simulation. Already, however, we see that k-ART 
can be a useful tool for these complex systems. 



V. CONCLUSION 

In this paper, we present in details the kinetic ART 
algorithm, a versatile self-learning on-the-fly off-lattice 
kinetic Monte Carlo method. This method couples ART 
nouveau [15], a very efficient non-biased open-ended algo- 
rithm for finding transition states [21, 22], with a topolog- 
ical classification of events based on nauty, a powerful 
packaged developed by McKay [16]. 

Kinetic ART constructs a reusable event catalog that 
improves the efficiency of the algorithm over time. Events 
are stored as generic events coupled to a given topology. 
To fully include elastic deformations, the lowest-energy 
barriers are separately relaxed to specific events to fully 
account for geometrical and elastic deformations. By 
construction, the algorithm also automatically identifies 
cases when the topology does not correspond to a sin- 
gle geometry, ensuring that the basic approximations are 
valid for all events. For efficiency, k-ART also includes 
local force calculations, allowing sub-linear scaling with 
system size, and an exact handling of flickers extended 
from the mean-rate method [30]. Other acceleration tech- 
niques, such as parallel handling of event relaxation and 
generation, are also implemented in the current version 
of the k-ART package. 

To demonstrate k-ART 's versatility, we applied the al- 
gorithm to three problems: vacancy-interstitial annihila- 
tion in c-Si, interstitial diffusion in Fe, and relaxation of 
a-Si. Clearly, the algorithm, although slower than stan- 
dard KMC, can handle accurately complex systems with 
many tens of thousands of topologies much faster than 
MD, opening the possibility of studying problems that 
have long remained out of reach of simulation. 



Appendix: The Mean Rate Method 

Following Puchala et al. [30], the system is separated 
in transient states and absorbing states. To determine 
the probability to exit the basin to state x, we calculate 
the transistion probability matrix T, with components 



R. 



1^3 



E/c^i 



■rlR, 



^j' 



(A.l) 



where Ri^j is the rate going from basin state i to basin 
state j, and the summation is over all basin and exit 
states k. r/, the reciprocal of the sum of all rates leaving 
state i, is the mean residence time in state i each time it 
is visited. The occupation probability vector of all basin 
states after in-basin jump m (and before m + 1), 0{m) 
is thus given by repeated application of T to the initial 
occupation probability 0^(0) = ^^g, where s is the start- 
ing state. The sum of the occupation probabilities over 
all possible number of jumps gives the average number 
each basin state is visited: 



0sum _ ^ T^a(o) = (1 - T)-^0(o), 



(A.2) 



m=0 
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from which the mean residence time in basin state i be- 
fore leaving the basin can be calculated: 



r^e^ 



(A.3) 



These residence times are then used to accelerate the 
basin exit rates from basin state i to exit state j according 
to 



{R^ 



-^Jl 






(A.4) 



with k summing over all basin states. The next KMC 
step is then determined using standard KMC rules, using 
these accelerated rates. 

In contrast to the first passage time analysis (FPTA) 
[30, 63], the mean rate method is computationally much 
simpler, as it requires a single matrix inversion to calcu- 
late the modified rates, after which the ordinary KMC 



rules apply. This comes at a cost: There is no corre- 
lation between the randomly determined residence time 
and the selected exit state. Puchala et al. find [30] that 
in measuring average quantities after many steps, both 
MRM and FPTA yield the same results. 
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